
[PROPOSAL] Switch to pytest style test classes, use plain asserts #4204

Merged (10 commits, May 7, 2020)

Conversation

@epwalsh (Member) commented May 7, 2020

As far as I can tell, since we're using pytest to run tests, there's no benefit to using unittest.TestCase as the base class for AllenNlpTestCase over plain pytest-style classes.

There are, however, several disadvantages:

  1. @pytest.mark.parametrize does not work with methods on a unittest.TestCase class. Given that parametrizing tests is such a powerful tool, I think this is a major issue (see the sketch after this list).
  2. unittest.TestCase assertion methods (self.assertEqual, self.assertSetEqual, etc.) are less concise than plain assert statements, which are what pytest recommends.
  3. The setUp and tearDown methods do not have PEP 8-compatible names, and their names don't make clear when they run. The pytest equivalents (setup_method/teardown_method and setup_class/teardown_class) have snake_case names that say exactly when they run.
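To make points 1 and 2 concrete, here is a minimal sketch (hypothetical test names, not code from this PR) of a plain pytest-style class:

```python
import pytest


class TestWhitespaceSplit:  # plain pytest-style class, no unittest.TestCase base
    def setup_method(self):
        # Runs before each test method, like unittest's setUp but PEP 8-named.
        self.separator = " "

    # parametrize works on plain classes; it does not work on unittest.TestCase methods.
    @pytest.mark.parametrize("text, expected", [("a b", ["a", "b"]), ("", [""])])
    def test_split(self, text, expected):
        # Plain assert instead of self.assertEqual(expected, text.split(...)).
        assert text.split(self.separator) == expected
```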

I know this looks like a huge PR but it's mostly just replacing unittest assertion methods with the pytest equivalents.

@epwalsh (Member, Author) commented May 7, 2020

The "Check Models" workflow will fail until the unittest assertion methods in the models repo are replaced with the pytest equivalents.

```diff
 _available_devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])
-multi_device = parametrize(("device",), [(device,) for device in _available_devices])
+multi_device = lambda f: pytest.mark.parametrize("device", _available_devices)(pytest.mark.gpu(f))
```
@epwalsh (Member, Author):

The pytest.mark.gpu(f) ensures these tests are run in the "GPU Checks" workflow.
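A usage sketch (hypothetical test, assuming the multi_device decorator above): the decorated test runs once per entry in _available_devices, and the gpu mark lets the "GPU Checks" workflow select it (e.g. with pytest -m gpu).

```python
import torch


@multi_device
def test_tensor_addition(device):
    # Runs on "cpu", and also on "cuda" when a GPU is available.
    ones = torch.ones(3, device=device)
    assert torch.equal(ones + ones, torch.full((3,), 2.0, device=device))
```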

@dirkgr (Member) commented May 7, 2020

I have been wanting to do this for a while. It's especially confusing in the models repo, because AllenNlpTestCase supplies some functions that don't work there.

@dirkgr (Member) left a comment

This is a bit different from what I thought you were doing. I thought you were changing everything to top-level functions named test_*, with common setup and teardown in pytest fixtures. The base class AllenNlpTestCase would then go away. Any reason not to do that?
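For comparison, a hedged sketch of the alternative described here (names are hypothetical, not from the PR): top-level test_* functions with shared setup and teardown supplied by a pytest fixture instead of a base class.

```python
import shutil
import tempfile
from pathlib import Path

import pytest


@pytest.fixture
def scratch_dir():
    # Setup: a fresh temporary directory per test.
    path = Path(tempfile.mkdtemp(prefix="allennlp_test_"))
    yield path
    # Teardown: remove it after the test finishes.
    shutil.rmtree(path, ignore_errors=True)


def test_write_and_read(scratch_dir):
    (scratch_dir / "data.txt").write_text("hello")
    assert (scratch_dir / "data.txt").read_text() == "hello"
```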

```diff
-        self.assertEqual(
-            expected_output,
-            actual_output,
+        assert expected_output == actual_output, (
```
Member:

Will this give me those pretty comparisons that show immediately where the difference is, even in large data?

@epwalsh (Member, Author):

Yes sir!
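For example (a minimal sketch, not from this PR), a failing plain assert on two nested structures is rewritten by pytest so the failure report points at the differing element rather than just printing both objects:

```python
def test_detailed_diff_on_failure():
    expected = {"tokens": [1, 2, 3], "mask": [1, 1, 1]}
    actual = {"tokens": [1, 2, 4], "mask": [1, 1, 1]}
    # On failure, pytest's assertion rewriting shows which key/index differs.
    assert expected == actual
```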

@epwalsh (Member, Author) commented May 7, 2020

> This is a bit different from what I thought you were doing. I thought you were changing everything to top-level functions named test_*, with common setup and teardown in pytest fixtures. The base class AllenNlpTestCase would then go away. Any reason not to do that?

I think having a base class is beneficial because it wraps up all of its fixtures. Without it, you would have to import the fixtures (the temp directory and whatever else the base class comes with) into each test module, or into a conftest.py at the same directory level as the test module, because there is no way to declare global fixtures that work for all test modules without explicitly importing them.

That said, if you don't need those fixtures then it's just unnecessary overhead. So we could probably just use def test_* functions in many places.
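A minimal sketch (not the actual AllenNlpTestCase) of the base-class approach: a plain pytest-style class whose setup_method/teardown_method bundle the common fixtures, so subclasses need no extra imports or per-directory conftest.py.

```python
import shutil
import tempfile
from pathlib import Path


class BaseTestCaseSketch:
    def setup_method(self):
        # Fresh scratch directory before every test method.
        self.TEST_DIR = Path(tempfile.mkdtemp(prefix="allennlp_tests_"))

    def teardown_method(self):
        shutil.rmtree(self.TEST_DIR, ignore_errors=True)


class TestUsesScratchDir(BaseTestCaseSketch):
    def test_writes_file(self):
        (self.TEST_DIR / "out.txt").write_text("ok")
        assert (self.TEST_DIR / "out.txt").exists()
```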

```diff
@@ -116,7 +116,9 @@ jobs:
       ALLENNLP_VERSION_OVERRIDE: ""  # Don't replace the core library.
       run: |
         git clone https://github.com/allenai/allennlp-models.git
-        cd allennlp-models && pip install --upgrade --upgrade-strategy eager -e . -r dev-requirements.txt
+        cd allennlp-models
+        git checkout test-case  # TODO remove this line
```
@epwalsh (Member, Author):

Just putting this here temporarily to make sure the models tests pass on that branch

@matt-gardner (Contributor) left a comment

LGTM! I would vote strongly against moving everything to top-level test methods, by the way. In addition to what @epwalsh said, it makes model tests more annoying.

```diff
@@ -65,25 +66,25 @@ def test_sequence_label_field_raises_on_incorrect_type(self):
         with pytest.raises(ConfigurationError):
             _ = SequenceLabelField([[], [], [], [], []], self.text)

-    def test_class_variables_for_namespace_warnings_work_correctly(self):
+    def test_class_variables_for_namespace_warnings_work_correctly(self, caplog):
```
Contributor:

Nit: caplog is not a great variable name.

@epwalsh (Member, Author), May 7, 2020:

This is actually an out-of-the-box pytest fixture, so unfortunately I think we're stuck with that name.
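For reference, caplog is pytest's built-in log-capture fixture; here is a small sketch of how it is typically used (hypothetical logger and message, not the actual test):

```python
import logging


def test_emits_namespace_warning(caplog):
    # caplog captures log records emitted during the test.
    with caplog.at_level(logging.WARNING):
        logging.getLogger("allennlp").warning("namespace warning")
    assert "namespace warning" in caplog.text
```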

Contributor:

Oh, hmm, didn't realize that pytest did this magic. Ok 🤷‍♂️

```diff
         self.model = torch.nn.Sequential(torch.nn.Linear(10, 10))

     def test_reduce_on_plateau_error_throw_when_no_metrics_exist(self):
-        with self.assertRaises(ConfigurationError) as context:
+        with pytest.raises(ConfigurationError) as context:
```
Contributor:

I think this should just be something like match="learning rate ...", instead of as context, right?

@epwalsh (Member, Author):

Good point. Just updated
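The suggested pattern, as a hedged sketch (the exception message and regex below are illustrative, not the actual AllenNLP ones): pass match= to pytest.raises instead of capturing the exception and inspecting it by hand.

```python
import pytest


class ConfigurationError(Exception):
    pass


def build_scheduler_without_metric():
    # Stand-in for the code under test.
    raise ConfigurationError("learning rate scheduler requires a validation metric")


def test_reduce_on_plateau_requires_metric():
    # match= applies a regex search against the string of the raised exception.
    with pytest.raises(ConfigurationError, match="learning rate"):
        build_scheduler_without_metric()
```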

@schmmd (Member) left a comment

Looks reasonable to me!

@epwalsh merged commit 82bf58a into allenai:master on May 7, 2020.
@epwalsh deleted the test-case branch on May 7, 2020 at 21:05.